-
Abstract: Surprisal theory posits that less-predictable words should take more time to process, with word predictability quantified as surprisal, i.e., negative log probability in context. While evidence supporting the predictions of surprisal theory has been replicated widely, much of it has focused on a very narrow slice of data: native English speakers reading English texts. Indeed, no comprehensive multilingual analysis exists. We address this gap in the current literature by investigating the relationship between surprisal and reading times in eleven different languages, distributed across five language families. Deriving estimates from language models trained on monolingual and multilingual corpora, we test three predictions associated with surprisal theory: (i) whether surprisal is predictive of reading times, (ii) whether expected surprisal, i.e., contextual entropy, is predictive of reading times, and (iii) whether the linking function between surprisal and reading times is linear. We find that all three predictions are borne out crosslinguistically. By focusing on a more diverse set of languages, we argue that these results offer the most robust link to date between information theory and incremental language processing across languages.
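The two quantities named in the abstract above can be illustrated with a minimal sketch. It assumes a toy next-word distribution with made-up probabilities (the words and numbers are hypothetical, not taken from the study) and uses base-2 logarithms, which is also an assumption, since the abstract does not specify a base.

```python
import math


def surprisal(prob: float) -> float:
    """Surprisal in bits: negative log2 probability of a word in its context."""
    return -math.log2(prob)


def contextual_entropy(dist: dict[str, float]) -> float:
    """Expected surprisal over the next-word distribution, in bits."""
    return sum(p * surprisal(p) for p in dist.values() if p > 0)


# Hypothetical next-word distribution for a single context (illustrative only).
next_word = {"dog": 0.5, "cat": 0.25, "bird": 0.25}

print(surprisal(next_word["dog"]))    # 1.0 bit
print(contextual_entropy(next_word))  # 1.5 bits
```

In this sketch the more probable word ("dog") carries less surprisal than the rarer ones, and the entropy is the probability-weighted average of those surprisals. In the study itself the probabilities are estimated by language models rather than given by hand.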
-
When a language offers multiple options for expressing the same meaning, what principles govern a speaker’s choice? Two well-known principles proposed for explaining wide-ranging speaker preference are Uniform Information Density and Availability-Based Production. Here we test the predictions of these theories in a previously uninvestigated case of speaker choice. Russian has two ways of expressing the comparative: an EXPLICIT option (Ona bystree chem ja / She fast-COMP than me-NOM) and a GENITIVE option (Ona bystree menya / She fast-COMP me-GEN). We lay out several potential predictions of each theory for speaker choice in the Russian comparative construction, including effects of post-comparative word predictability, phrase length, syntactic complexity, and semantic association between the comparative adjective and subsequent noun. In a corpus study, we find that the explicit construction is used preferentially when the post-comparative noun phrase is longer, has a relative clause, and is less semantically associated with the comparative adjective. A follow-up production experiment using visual scene stimuli to elicit comparative sentences replicates the corpus finding that Russian native speakers prefer the explicit form when post-comparative phrases are longer. These findings offer no clear support for the predictions of Uniform Information Density, but are broadly supportive of Availability-Based Production, with the explicit option serving as an unreduced form that eases speakers’ planning of complex or low-availability utterances. Code for this study is available…
-
Contemporary autoregressive language models (LMs) trained purely on corpus data have been shown to capture numerous features of human incremental processing. However, past work has also suggested dissociations between corpus probabilities and human next-word predictions. Here we evaluate several state-of-the-art language models for their match to human next-word predictions and to reading time behavior from eye movements. We then propose a novel method for distilling the linguistic information implicit in human linguistic predictions into pre-trained LMs: Cloze Distillation. We apply this method to a baseline neural LM and show potential improvement in reading time prediction and generalization to held-out human cloze data.
-
Understanding a gradable adjective (e.g., big) requires making reference to a comparison class, a set of objects or entities against which the referent is implicitly compared (e.g., big for a Great Dane), but how do listeners decide upon a comparison class? Simple models of semantic composition stipulate that the adjective combines with a noun, which necessarily becomes the comparison class (e.g., “That Great Dane is big” means big for a Great Dane). We investigate an alternative hypothesis built on the idea that the utility of a noun in an adjectival utterance can be either for reference (getting the listener to attend to the right object) or predication (describing a property of the referent). Therefore, we hypothesize that when the presence of a noun N can be explained away by its utility in reference (e.g., being in the subject position: “That N is big”), it is less likely to set the comparison class. Across three pre-registered experiments, we find evidence that listeners use the noun as a cue to infer comparison classes consistent with a trade-off between reference and predication. This work highlights the complexity of the relation between the form of an utterance and its meaning.